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The  goal  of  this  research  was  to  develop  designs  for  arithmetic 
processors  with  input  and  output  data  both  in  the  Burst  format.  An  ad- 
ditional requirement  was  that  these  designs  be  constructive  from  readily 
available  logic  circuits  such  as  TTL  SSI  and  MSI  and  be  competitive  on 
both  a  cost  and  performance  basis  with  the  Block  Sum  Register  (BSR) 
techniques  first  proposed  for  Burst  Processing. 

The  result  is  a  family  of  arithmetic  processors  based  on  a  design 
for  adding  Bursts  known  as  the  Perverted  Adder  (PA).  The  PA  is  a  full 
adder  and  a  flip-flop  connected  with  the  sum  output  of  the  full  adder 
fed  back  to  the  carry  input  via  the  flip-flop.  This  simple  circuit 
performs  Burst  addition.  However,  the  output  Bursts  are  uncompacted, 
i.e.  the  ones  are  no  longer  adjacent  to  one  another,  whereas  BSR  de- 
signs output  compacted  Bursts. 

Designs  for  a  multiplier  and  divider  use  a  tree  of  perverted  adders 
to  perform  the  requisite  additions  in  parallel.  Several  modifications  of 
this  basic  PA  tree  are  also  presented.  They  permit  a  reduction  in  the 
amount  of  hardware  and  a  natural  extension  to  Burst  lengths  other  than 
the  integer  powers  of  two  that  are  so  natural  for  PA  tree  design.  Several 
designs  are  also  presented  for  performing  arithmetic  on  signed  Bursts. 
Designs  employing  both  the  sign  and  magnitude  and  a  biased  Burst 
technique  are  illustrated. 


The  designs  for  unsigned  Bursts  and  some  of  those  for  signed  Bursts 
use  no  registers  to  assemble  Bursts  before  or  after  the  processor. 
This  design  feature  makes  these  truly  on-line  processors  and  gives  them 
better  high  frequency  response  for  a  given  clock  rate  than  the  other 
logic  designs  which  have  been  proposed. 

A  PA  CPU  has  been  constructed  to  demonstrate  Burst  multiplication  and 
division  using  PA  techniques.  The  design  of  the  CPU  is  basically  a 
PA  divider  with  modifications  to  permit  the  circuit  to  function  as  a 
multiplier.  The  design  is  based  on  an  8-bit-per-Burst  format  and 
displays  the  output  in  the  form  of  the  number  of  "1"  bits  in  either  8  or  64 
time  slots  at  the  user's  option. 

A  comparison  to  other  Burst  processor  designs  shows  that  the  PA 
designs  are  less  hardware  intensive  and  are  generally  faster  than  the 
other  designs. 
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1   INTRODUCTION 
1.1  An  Explanation  of  Burst  Processing 

Burst  processing  was  first  proposed  in  1974  by  Professor  W.  J. 
Poppelbaum  (1)  as  a  PCM  transmission  and  processing  system  that  com- 
bined the  advantages  of  both  stochastics  and  weighted  binary  methods. 
The  unit  of  information  in  Burst  processing  is  the  Burst  which  con- 
sists of  a  predetermined  number  of  "bits".  These  bits  are  all  of 
equal  weight  as  opposed  to  weighted  binary  where  bit  position  deter- 
mines a  weighting  factor. 

The  equal  weighting  of  the  bits  in  a  Burst  causes  Burst  process- 
ing to  have  high  noise  tolerance  without  a  need  for  sophisticated  error 
correcting  codes  and  the  complex  hardware  associated  with  such  devices. 
This  noise  tolerance  does  not  come  cheaply.  A  ten  bit  Burst  is  able  to 
represent  11  different  values.  Only  about  3-1/2  bits  of  weighted  binary 
are  necessary  for  the  same  precision.  Another  advantage  is  that  there 
is  no  need  for  synchronization.  Because  e\tery   bit  has  the  same  weight 
you  can  start  anywhere. 

Because  Bursts  are  a  deterministic  representation  of  values, 
they  do  not  suffer  from  the  severe  problems  of  attaining  accuracy 
that  stochastic  methods  do.  In  Burst  processing  the  accuracy  is  di- 
rectly proportional  to  the  length  of  the  bit  stream,  or  number  of 
Bursts  being  observed.  In  stochastics  the  accuracy  goes  up  only  as 
the  square  root  of  the  length  of  the  bit  stream. 
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The  first  method  for  performing  arithmetic  on  Bursts  involved  a 
Block  Sum  Register  (BSR).  A  BSR  is  a  shift  register  with  each  bit 
position  of  the  register  controlling  a  current  source  (see  Figure  1). 
As  the  data  in  Burst  format  passes  through  the  BSR  it  produces  an  output 
signal  proportional  to  the  number  of  one  bits  in  the  register.  BSRs 
perform  arithmetic  in  a  quasi -analog  fashion  and  re-encode  the  result- 
ant data  to  Burst  format  for  retransmission. 

Applications  of  Burst  processing  have  been  demonstrated  by  the 
Information  Engineering  Laboratory,  under  the  direction  of  Professor 
Poppelbaum.  (2)  Many  current  projects  are  using  Perverted  Adder  (PA) 
techniques  as  developed  in  this  thesis  rather  than  BSRs  or  other  logic 
approaches  to  performing  arithmetic.  Comparisons  of  the  various  methods 
are  found  in  chapter  4. 

1.2  Burst  Encoding  Methods 

Before  data  can  be  transmitted  or  processed  in  the  Burst  format 
it  is  necessary  to  encode  it  in  that  form.  The  simplest  form  of 
encoder  is  the  ramp  encoder.  A  staircase  waveform  is  generated  and  con- 
stantly compared  to  the  analog  input.  If  the  analog  signal  is  greater 
than  the  staircase,  the  encoder  transmits  a  one.   If  it  is  lower,  a 
zero  is  transmitted.   Figure  2  depicts  this  encoding  process.  This 
simple  encoder  is  incapable  of  generating  a  Burst  sequence  with  a  long 
term  accuracy  greater  than  that  available  from  a  single  Burst  (e.g.  one 
part  in  ten). 


(A 
CD 
O 


>|o: 
x 


OH 


I 
00 

c 


2       ° 

< 


■o« 


o 


c 
o 


c 
o 
o 


d31SI03d    ldlHS 


o 

00 

2 

>- 


< 


Q- 

z 


o 

z 
< 


h-    => 

<n     a. 

CD       O 


o 
o 
o 
o 
o 
o 
o 
o 
o 
o 
o 
o 
o 
o 
o 

o 
o 


o 
o 


c 

(0 

c 
o 

•I— 

+-> 

J- 

<L> 
Ol 

o 

O 

O) 

Q. 
•r- 

(J 

C 
•r— 

$- 
Q. 

i- 

cu 
-o 
o 

(J 

c 

O) 


«3 


CM 
<D  . 

s- 

CD 


The  vernier  encoder  yields  results  which  are  more  accurate  when 
averaged  over  a  longer  period  of  time.  The  principle  is  to  take  an 
n  step  ramp  encoder  and  augment  it  with  a  second  staircase  generator 
operating  at  1/n  the  frequency  and  with  a  stepsize  of  1/n  that  of  the 
first  encoder.  This  scheme  is  capable  of .  one  percent  precision 
in  ten  Bursts  of  ten  bits  each.  Figure  3  shows  the  construction  of 
such  an  encoder  and  the  output  while  encoding  a  number  with  1%  pre- 
cision. The  output  shown  is  the  number  of  ones  in  each  Burst 
generated  by  the  encoder. 

An  encoder  developed  by  Taylor  (3)  has  superior  high  frequency 
response  and  improved  noise  characteristics.  The  complexity  of  this 
Delta  Block  Encoder  (DBE)  is  comparable  to  the  ramp  encoder.  A  sig- 
nificant difference  is  that  the  output  is  in  the  form  of  uncompacted 
Bursts  (see  section  1.3).  Uncompacted  Bursts  are  perfectly  acceptable 
to  BSR  or  PA  designs,  but  are  incompatible  with  the  other  arithmetic 
processors  discussed  in  chapter  4. 

The  DBE  is  similar  to  conventional  delta  modulation.  The  input 
signal  is  compared  to  the  encoded  version  and  the  next  output  is  chosen 
to  minimize  the  error.  The  trade-off  between  frequency  response  and 
precision  is  done  by  choosing  the  length  of  encoded  signal  used  in 
the  comparison  process. 

1.3  Definitions 

A  compacted  Burst  is  a  Burst  as  the  principle  of  Burst  processing 
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was  developed.  All  the  ones  in  a  single  Burst  are  adjacent.  In  this 
representation  the  beginning  of  a  Burst  can  easily  be  defined  as  the 
first  one  of  the  adjacent  group. 

Uncompacted  Bursts  have  ones  scattered  throughout  the  Burst 
and  thus  lose  the  sense  of  a  "burst"  of  ones.  Better  frequency 
response  is  attainable  with  uncompacted  Bursts,  but  some  of  the  cheap 
error  correcting  methods  (3)  possible  with  compacted  Bursts  are  lost. 

A  Super  Burst  is  a  group  of  n  Bursts  each  n  bits  long.  If  a  Burst 

then  can  have  a  precision  of  p,  a  Super  Burst  can  have  a  precision  of 

2 
p  .  The  vernier  encoder  of  section  1.2  is  designed  to  produce  Super 

Bursts. 

The  Bursts  that  have  been  described  thus  far  are  capable  of 
representing  only  positive  numbers.  They  will  be  referred  to  as 
unsigned  Bursts.  Most  Burst  applications  have  used  unsigned 
Bursts. 

Representation  of  negative  numbers  has  been  accomplished  in 
two  ways.  The  sign  and  magnitude  representation  uses  an  additional 
signal  to  represent  the  sign  of  the  number,  the  magnitude  is  the 
same  form  as  an  unsigned  Burst.  A  biased  Burst  represents  signed 
numbers  by  assigning  the  value  of  zero  to  a  half  filled  Burst; 
less  than  half  is  negative,  more  than  half  is  positive.  Designs 
employing  both  representations  are  presented  in  chapter  2. 


2  PERVERTED  ADDER  CIRCUITS 

2.1  Introduction 

A  new  family  of  circuits  has  been  designed  to  perform  arithmetic 
on  data  in  the  Burst  format.  This  family  is  the  result  of  a  search 
for  a  logic  design  that  is  competitive  with  the  Block  Sum  Register 
(BSR)  arithmetic  elements  that  were  first  proposed  for  Burst  arith- 
metic. 

This  family  of  circuits  is  based  on  a  simple  circuit  building 
block  which  has  been  called  the  Perverted  Adder  (PA).  The  name  was 
chosen  because  of  the  unconventional  use  of  a  full  adder  in  the 
circuit.  It  should  be  noted  that  logically  simpler  realizations  of 
a  PA  than  the  one  shown  here  are  possible.  However,  their  realiza- 
tion with  conventional  SSI  and  MSI  chips  yields  a  higher  chip  count 
than  the  circuit  described  here. 

2.2  Logic  Description  of  the  PA 

The  function  of  the  PA  is  to  accept  input  in  Burst  format  on 
two  input  lines  and  to  produce  as  an  output  the  sum,  also  in  Burst 
format.  This  sum  must  necessarily  be  scaled  by  a  factor  of  two 
to  prevent  overflow.  Table  1  is  a  state  table  (4)  which  describes 
the  PA.  Notice  that  there  are  two  inputs  "X-,"  ,  and  "X-",  two  in- 
ternal states,  and  one  output. 


Table  1:  State  table  for  the  Perverted  Adder  (PA) 
(PS  =  Present  State;  NS  =  Next  State) 


NS,Z 

PS 

x^2  =  00 

01 

11 

10 

A 
B 

A,0 
B,0 

B,0 
A.l 

A.l 
B.,1 

B,0 
A.l 

The  machine  is  in  state  "A"  when  an  even  number  of  pulses  have 
arrived  on  the  two  inputs  and  in  state  "B"  after  an  odd  number  of 

pulses.  An  output  "1"  is  generated  whenever  the  machine  goes  from 

state  "B"  to  state  "A"  or  when  two  ones  appear  at  the  inputs  simul- 

» 
taneously. 

The  combinational  equations  describing  this  machine  are: 


Z  =  x]x2  +  x^  +  x2y 


(1) 


Y  =  x^y  +  x^2y  +  x^2y  +  x-,x2y 


(2) 


7"  is  the  output  and  "Y"  is  fed  back  to  the  delay  element  yielding 
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when  the  flip-flop  passes  the  signal.  Except  for  an  interchange  of  the 
left  hand  sides,  the  above  equations  are  just  those  of  a  conventional  serial 
adder.  Figure  4  shows  the  logic  of  a  full  adder,  the  full  adder  and  a 
D  flip-flop  connected  as  a  Perverted  Adder  and  the  symbol  used  to  repre- 
sent the  PA. 

This  PA  circuit  is  all  that  is  required  to  add  two  unsigned  Bursts 
or  two  Bursts  with  biased  Burst  representation.  To  subtract  biased 
Bursts  an  inverter  is  inserted  in  the  input  lead  of  the  subtrahend. 
The  output  is  always  a  properly  scaled,  uncompacted  Burst.  Uncompacted 
Bursts  are  completely  acceptable  as  input. 

2.3  Design  and  Operational  Principles  of  a  PA  Multiplier 

A  multiplier  has  been  designed  in  which  the  multiplier,  multiplicand 
and  the  product  are  all  in  Burst  format.  It  will  be  helpful  to  refer  to 
Figure  5  during  the  discussion  of  the  operation  of  the  PA  multiplier. 

The  multiplicand  ("B"  in  Figure  5)  flows  through  the  shift  register  at 
the  top  of  the  diagram.  The  shift  register  is  exactly  the  length  of  one 
Burst,  and  so,  like  a  BSR,  always  contains  as  many  ones  as  the  value  of 
the  number  being  represented.  Notice  that  the  multiplier  ("A"  in  Figure  5) 
is  ANDed  with  a  clock  signal  and  the  output  of  this  AND  gate  is  used  to 
clock  the  Perverted  Adders.  The  top  row  of  PAs  will  produce  a  scaled 
sum  of  the  contents  of  the  shift  register.  This  sum  will  appear  in  paral- 
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Figure  4.  Logic  diagram  of  a  Full  Adder  (top),  Full  Adder  and 
Flip-flop  connected  as  a  PA  (lower  left),  Symbol  for  PA  (lower  right) 
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Figure  5.     Circuit  for  the  PA  multiplier. 
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lei  on  the  four  outputs  from  the  top  row.  These  four  lines  are  Inputs 
to  the  second  row  of  PAs  and  the  outputs  of  these  go  to  the  Inputs  of  the 
final  PA.  The  actual  addition  takes  place  only  when  a  one  is  present  on 
A  to  produce  the  clocking  signal.  The  result  is  that  B  will  be  added  into 
the  PA  tree  A  times  and  the  output  will  be  the  scaled  sum  of  these  additions. 
Each  level  of  the  PA  tree  causes  a  scaling  by  a  factor  of  2,  hence  the  factor 
of  8  for  the  multiplier  in  Figure  5. 

The  delay  through  this  circuit  is  dependent  on  the  Burst  sum  propo- 
gation.  Recalling  that  this  is  actually  the  carry  output  as  the  full  adder 
is  normally  used  and  that  that  output  is  often  faster  than  the  production  of 
the  sum,  propagation  speed  through  the  tree  is  quite  good.   Speeds  of  5  to 
8  MHz  are  possible  with  conventional  TTL  ICs. 

2.4  Design  and  Operational  Principles  of  a  PA  Divider 

A  PA  divider  can  be  constructed  by  adding  only  a  few  components  to  the 
multiplier.  Recall  from  the  discussion  of  the  PA  multiplier  that  the  PA 
tree  produces  one  output  pulse  (on  the  average)  for  eyery   8  ones  added  into 
the  top  row  of  PAs.  If  one  were  to  "load"  the  tree  with  8-n  ones,  then  just 
n  more  ones  would  produce  a  one  at  the  output.  This  would,  in  effect,  be 
performing  division  by  n  on  the  input  stream  supplying  the  additional  pulses. 
If  eyery   time  the  output  of  the  tree  produces  a  one,  we  load  the  tree  again, 
we  are  performing  a  continuous  division  by  n.  The  number  of  ones  needed  to 
load  the  tree  (8-n)  is  easily  generated  by  inverting  the  divisor  and  passing 
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it  into  a  shift  register.  An  example  should  serve  to  clarify  this  division 
algorithm.  Assume  we  wish  to  calculate  A/B  in  an  8  bit-per-Burst  system. 
A  can  be  any  number,  B  will  be  assumed  to  be  3.  Passing  B  through  an 
inverter  produces  Bursts  with  5  ones.  Begin  by  adding  the  5  ones  (in 
parallel)  into  the  PA  tree.  Begin  to  add  A  into  the  tree,  one  bit  at  a  time, 
into  one  input  to  the  top  row  of  the  PA  tree.  After  three  ones  (on  the 
average)  from  A  are  added  into  the  tree,  a  one  will  be  output.  At  this 
time  we  again  add  5  ones  (the  inverted  B)  into  the  tree  and  proceed  adding 
more  bits  from  A. 

A  divider  of  this  type  needs  a  two  phase  clock.  During  one  clock 
phase  the  inverted  divisor  is  added  into  the  tree  (when  needed),  and 
during  the  other  phase  the  dividend  is  added  into  one  input  of  one  PA 
in  the  top  row.  The  PA  tree  must  be  clocked  on  both  phases. 

It  is  possible,  at  the  expense  of  some  hardware  to  build  a  divider 
using  a  single  phase  clock.  This  divider  runs  at  speeds  comparable 
to  the  multiplier  as  it  is  necessary  to  wait  for  only  one  ripple  through 
the  tree  each  cycle  instead  of  two.  This  can  be  performed  in  either  of 
two  ways.  The  inverted  divisor  can  be  staticized  (i.e.,  held  without 
shifting  for  8  clock  cycles)  in  a  register  in  such  a  way  that  the  one 
bits  in  that  register  do  not  interfere  with  the  input  of  the  dividend. 
It  may  also  be  accomplished  by  moving  the  one  bits  (specifically  the 
one  that  falls  on  the  input  lead  reserved  for  the  dividend)  as  necessary 
with  combinational  logic.  This  latter  approach  is  used  in  the  circuits 
in  Chapter  3  and  is  illustrated  in  Figure  6.  The  combinational  logic 
shown  in  Figure  6  implements  the  following  equations: 
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Figure  6.  Circuit  for  the  PA  divider. 
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C]  =  B]  B2  B3  B4  B5  B6  B7  Bg  (3) 

C2  -  B1  +  B2  (4) 


C3  =  Bl  B2  +  B3  (5) 


C4  =  B]  B2  B3  +  B4  (6) 


C5  =  B]  B2  B3  B4  +  B5  (7) 


C6  =  Bl  B2  B3  B4  B5  +  B6  (8> 


C7  =  B-|  B2  B3  B4  B5  Bg  +  B?  (9) 


C8  =  Bl  B2  B3  B4  B5  B6  B7  +  B8  (10) 

The  quotient  in  this  divider  is  not  scaled,  therefore  overflow  occurs 
only  upon  division  by  numbers  smaller  than  1.  Division  of  any  number  by  ' 
zero  yields  Bursts  completely  filled  with  ones. 

2.5  Use  of  PA  Circuits  in  Signed  Arithmetic  Processors 

Addition  and  subtraction  of  biased  Bursts  were  covered  in  section  2.2 
and  proved  to  be  trivial  to  implement.  Multiplication  and  division  of  sign 
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and  magnitude  Bursts  is  also  very   simple.  The  only  additional  circuitry 
needed  is  an  exclusive  or  gate  to  process  the  sign  information.  The  imple- 
mentation of  a  sign  and  magnitude  adder  or  subtractor  and  of  a  biased  PA 
multiplier  are  described  in  this  section.  The  most  economical  way  to  per- 
form division  on  biased  Bursts  appears  to  involve  conversion  to  sign  and 
magnitude  and  back  again.  In  each  of  these  circuits  it  is  necessary  to 
staticize  Bursts  in  registers.  In  each  case  the  data  is  delayed  for  a 
total  of  only  one  or  two  Burst  periods. 

2.5.1  Sign  and  Magnitude  PA  Adder/Subtractor 

A  possible  design  for  a  Sign  &  Magnitude  (S&M)  PA  Adder  is  shown  in 
Figure  7.  The  principle  components  are  two  PAs  and  an  n+1  bit  bi-directional 
shift  register.  Not  shown  are  an  n  bit  parallel  load  shift  register  and  the 
clocking  circuitry  to  gate  the  result  from  one  register  to  the  other  and 
to  clear  the  bi-directional  shift  register  after  every   Burst.  This  design 
is  capable  of  accepting  compacted  or  uncompacted  Bursts  and  transmits 
compacted  Bursts. 

The  4  inputs  to  this  adder  (2  Bursts  and  two  signs)  are  duplicated 
above  each  of  the  PAs  in  Figure  7.  The  gating  between  the  input  and  the 
PAs  causes  the  left  PA  to  add  only  positive  numbers,  the  PA  on  the  right 
adds  only  negative  numbers.  The  outputs  of  the  PAs  are  used  to  control 
the  shifting  of  the  bi-directional  shift  register.  The  shift  register 
is  end  connected  via  inverters.  This  ensures  that  the  number  of  ones  in 
the  register  corresponds  to  the  magnitude  regardless  of  the  arrival  times 
of  positive  or  negative  addend  or  augend.  The  left  most  bit  of  the  re- 
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Figure  7.  A  design  for  a  sign  and  magnitude  adder. 
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gister  always  contains  the  sign  bit  according  to  the  convention  of  a 
one  for  negative.  The  eight  output  bits  are  gated  out  in  parallel,  one 
of  these  bits  being  generated  by  the  OR  of  the  two  end  bits. 

2.5.2  Biased  Representation  PA  Multiplier 

By  adding  additional  circuitry  to  the  PA  multiplier  of  Figure  5  it  is 
possible  to  multiply  biased  Bursts  directly  with  PAs.  A  mathematical 
explanation  of  the  principles  of  operation  of  this  circuit  will  simplify 
understanding  its  functioning. 

A  biased  Burst  representation  of  n  time  slots  per  Burst  will  contain 
the  number  of  ones  specified  by  the  following  equations: 

A  =  a  +  n/2  (11) 

B  =  b  +  n/2  (12) 

The  capital  letters  represent  the  number  of  "1"  bits  in  the  Burst. 
The  lower  case  letters  are  the  numeric  value  represented  and  n/2  is 
the  biasing  term. 

The  product  of  two  biased  Bursts  should  then  be 


2ab  +  n/2 

IT  (13) 

Where  "a"  and  "b"  have  been  multiplied,  scaled  and  biased.  In  terms  of 
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the  number  of  bits  in  the  incoming  Bursts  the  multiplication  function  be- 
comes : 

2AB  -  A  -  B  +  n.  (14) 

n 

The  PA  multiplier  of  Figure  5  performs  the  AB/n  part  of  this  function, 

Remembering  that  an  output  pulse  from  the  PA  multiplier  is  possible  only 

when  A  has  a  "1",  the  function  then  becomes 


AB  -  A*  -  B  +  n  (15) 

n 


where  A*  represents  the  A  pulses  that  do  not  produce  a  product  pulse  from 
the  PA  multiplier.  Looking  farther  we  see  that  -B  +  n  is  just  B,  so  we  have 

AB  -  A*  +  B  (16) 

n 

This  function  is  realized  by  the  circuit  in  Figure  8.  Not  shown  in  Figure  8 

are  a  register  to  collect  B~  left  packed  and  a  register  to  accept  and  serially 

transmit  the  data  as  well  as  the  necessary  clocking  circuitry. 

2.5.3  Representation  Converters 

No  biased  representation  divider  has  been  designed.  The  only  reasonable 
approach  would  be  to  convert  to  sign  and  magnitude  and  back  again.  Repre- 
sentation converters  are  not  yery   complex,  but  will  involve  running  the  bit 
rate  for  the  biased  representation  at  twice  that  of  the  sign  and  magnitude 
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l  =  RIGHT 
0=LEFT 

SHIFT   L/R 


Figure  8.     A  multiplier  for  biased  Bursts, 


22 


representation  in  order  to  prevent  loss  of  information.  A  sign  and  magnitude 
to  biased  converter  is  shown  in  Figure  9  and  a  converter  from  biased  to  sign 
and  magnitude  in  Figure  10. 

2.6  Modified  PA  Trees 

Several  changes  to  the  basic  PA  tree  design  are  possible  to  adapt  the 
circuit  better  to  certain  applications.  If  lower  speeds  can  be  tolerated 
in  order  to  save  hardware  the  circuit  principle  illustrated  1n  Figure  11 
can  be  used.  An  additional  advantage  is  that  the  random  noise  caused  by 
stored  bits  in  the  tree  is  decreased.  This  circuit  uses  only  one  flip-flop 
per  level  in  the  tree.  This  scheme  requires  carry  ripples  through  the 
tree  to  proceed  across  and  down.  The  optimum  method  of  carry  handling  is 
to  ripple  toward  the  center  of  the  tree  and  down. 

All  PA  trees  described  so  far  have  scaled  by  an  integer  power  of  2.  It 
is  possible  to  use  this  type  of  tree  with  noninteger  power  of  two  length 
Bursts,  but  dynamic  range  of  number  representation  is  lost.  For  example, 
ten  bit  Bursts  scaled  by  a  normal  4  level  tree,  cause  a  reduction  of  37% 
of  the  dynamic  range.  Two  such  stages  cause  a  loss  of  61%.  The  method 
shown  in  Figure  12  illustrates  a  technique  to  adjust  the  scaling  to  precisely 
the  value  needed  to  preserve  dynamic  range.  Figure  12  illustrates  the  tech 
nique  for  10  bit  Bursts  but  the  generalization  to  any  number  is  straight 
forward. 

To  modify  the  tree  to  scale  by  the  desired  factor,  the  smallest  tree  is 
constructed  which  is  capable  of  handling  the  desired  number  of  inputs.  The 


23 


00  o 


a: 

UJ 

i- 

— 
o 

UJ 


X 

CO 


o 
o 

_l 
o 


CO 

■M 
to 

s. 

3 
CO 

■o 

CD 
CO 

re 
.S 
o 


<D 

-o 


E 
C7> 

re 

E 

"O 

E 

re 

E 
D) 
•r— 
CO 


LU 

m 
O 

or 

CO 


z 

e> 

o 

< 

CO 

2 

E 


s- 

<U 

+J 

s_ 

> 

E 

o 
o 


<U 

s- 

3 


lfldNI 


z 

CO 

h- 
=> 
o. 

z 


24 


x<-> 

COW 


UJ 
CO 


o 

o 

a:-" 
coo 


_i  tc 

<  LU 

Z  H. 

O  U> 


o 

LU 


O 
LU 

o 

'1    X 
00    CO 


^    H- 


to 


rar 


IO 


o 


-\ 


CO 


> 


QC 

< 
a. 


Q_  O 
O 

CU     l    i— 


c 
o 

•r" 
-M 
fO 
-M 

C 

a> 

to 

aj 

s_ 

Q. 

a> 

s- 

aj 

"O 

CJ 
+■> 

C 

cn 

<T3 


a.  w 

r—     i- 

(/) 

c 
a>-o 

•r-    CU 

oo   to 

00 
CU  CD 
-C  o 
+J    o 

fO 

o 


-a 

c 


0.  C7) 

°      o 


to 
+J 

s- 

=J 
CO 

-o 
cu 
to 


CU 
CU  -Q 
N 

to    fO 

cu  E 

E 

O  -M 

O  'r- 

cu 

91 

-M   +-> 
•r-     IO 

.a  -r- 
en 

4_>    CU 

.c  s- 

O) 

•i-  i- 

t-    CU 
CU   o 

-C  r— 

+J 

CU 
CU   4-> 

s  o 

+-> 


o 

4- 

"O 
CU 
CU 


CU 

-C 

+J 
cu 


CU 

-o 

c 

s- 
cu 
■a 

> 

-a 

<: 

CL 
CU 
■M 

C 


to 

1) 

c: 
o 

o 
+J 


■a 

0) 
cn 


-a 
cu 

00 


"O 

CU 

+J 

N 

^~ 

»i — 

ZJ 

r— 

00 

<TJ 

a.) 

•i — 

i- 

4-> 

•i — 

CU 

C 

-C 

•I— 

+J 

y 


en 


cu 

-Q      • 

cn 

c  c 

u  +■> 

00  •!— 

+->  .c 

3    to 
CL 
+J    S- 

=j  o 
ol- 
io (-> 


to  c  cu  cn 
•r-  cu  -c  o 

rhi- 

CU  i— 

•M  •    (TJ 

tO        •  r—     C 

•r-  "O  CU    o 

cn  cu  «—  •■- 

CU  $-  f—  -M 
J-    fO    fO    fO 

cu  s-  c 

S-  f—  fO  T- 
CU  O  Q.J3 
Q-  E 

Q.  00  C  O 
3  -i-  t-    O 


-O  T- 


E 

o 

J- 

s- 
cu 
+J 

J- 
cu 
> 
c 
o 
u 


25 


<u 
u 

c 
0)  <a 
.a  •»-> 

to 

>>r: 

E 
0) 

-a  -c 
a) 

Q.  C 
CO  <U 
4-> 
4->  S- 
tO  O 
O  -C 
r—    10 

0J  O 
_C  +-> 
+■> 

«4-    O 

o  s- 

cu  -c 
E  a 
o  <o 
oo  a) 

<«- 
•  o 
<u 

i-  $- 
«o  <u 

-o  c 

s-  <u 

<o  u 

x: 

to  -C 
to  -t-> 
CD 


en 

c  a. 
•i-  o 
to  r-     • 

ca> 


OVi- 


to 

CD 
"O 

O) 

s- 
+■» 

<c 

Q_ 


01 

x: 

4-> 

a> 

c 

o 


s- 

+-> 
to 


r—  TO    Q- 

0)    Q. 

<U    C  t- 

C7>  C7>  <L> 

•r-    O)  -C 


26 


-* 

fc. 

5|2 

i 

o 

li. 

< 

ft. 

o 

i 

< 

0. 

^ 

to 

It 

3 
to 

W 

/ 

n 

0) 

s- 

to 
Ol 

2 

w~ 

"s 

to 

x: 

1 

4- 

+-> 

< 

Q_ 

01 

1        1        1 

SHIFT      REGISTER 
1              1              1 

+j 

+ 

a. 

<u 
■«-> 

< 

ft. 

<- 

o 

u.         -J 

u.         o 

a 

*- 

< 
ft. 

1 

01 

s- 
u 

o 

-M 

[ 

-► 

0. 

used 
ble. 

«- 

-> 

iques 
possi 

2 

< 
ft. 

*. 

c 

^ 

O    t~ 

k 

2 

«- 

a> 

+■>    J- 

01 

h 

u  a; 

A 

p 

< 
Q_ 

T3  •!- 

p 

01 

^ 

s1 

(0 

* 

•  >> 

t 

CD 

Si 

> 
> 

Figure 
Scalir 

27 


unused  inputs  (regardless  of  their  position  in  the  tree)  are  feedback  points 
to  preload  the  tree  in  a  manner  similar  to  that  described  in  section  2.4 
for  the  PA  divider. 

Table  2  shows  various  comparisons  for  performance  for  the  techniques  pre- 
sented in  this  section  for  various  Burst  lengths.  Hardware  complexity  and 
delay  times  are  given  for  normal  PA  trees  as  well  as  non-power-of-two  trees 
and  for  both  techniques  of  1  storage  element  per  level. 


Table  2:  Comparisons  of  PA  Trees 


Bits 

per 

Burst 

8 

10 

12 

16 

20 

Hardware  Complexity 

Full  Adders 

7 

11 

12 

15 

21 

Flip  Flops 

Normal 

7 

11 

12 

15 

21 

Modified 

3 

4 

4 

4 

5 

Delays  (ripples) 

Normal 

3 

4 

4 

4 

5 

Carry  to  End 

6 

8 

9 

11 

14 

Carry  to  Center 

5 

6 

7 

8 

10 
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2.7  Mul ti -input  PA  Adders 

Burst  addition  as  defined  with  automatic  scaling  is  not  associative; 
i  .e. 

(A  +  B)  +  C      ¥>  A  +  (B  +  C)    .  (17) 

After  calculation  the  left  side  will  be 

A  +  B  +  C 

4   4   2  (18) 

The  right  side  will  be 

A  +  B  +  C 

2  4   5"  (19) 

If  it  is  desired  to  generate 

A  +  B  +  C 

3  3   3  (20) 

this  can  be  accomplished  by  a  PA  tree  with  feedback.  Using  a  four  input 
tree  A,B  and  C  are  connected  to  three  inputs  and  feedback  is  supplied  to 
the  fourth. 

This  scheme  can  be  generalized  to  any  number  of  inputs. 


3  A  BURST  CPU  TO  DEMONSTRATE  PAs 

A  Burst  CPU  capable  of  multiplying  or  dividing  two  8-bit  Bursts 
has  been  constructed.  The  input  Bursts  have  each  bit  position  de- 
termined by  a  front  panel  switch.  The  output  is  displayed  on  a  pair 
of  seven  segment  LED  readouts.  The  counter  which  drives  the  read- 
outs is  selectable  between  8  and  64  clock  cycles  for  the  display. 
The  clock  may  be  external  or  internally  derived  by  dividing  the  power- 
line  frequency  by  1,  2,  4,  8,  or  16.  The  remainder  of  this  chapter 
is  devoted  to  a  more  detailed  description  of  this  CPU.  Figure  13  is 
a  photograph  of  this  CPU  which  is  complete  on  two  circuit  boards  plus 
the  front  panel  components.  Figure  14  is  a  block  diagram  of  the  CPU 
and  should  be  referenced  to  determine  how  the  blocks  of  circuitry  to 
be  discussed  relate  to  each  other.  Consult  Appendices  A  and  B  for 
lists  of  ICs  and  board  pin  connections  respectively. 

3.1  Clock  and  Control  Circuits 

The  clock  and  control  signal  generation  circuits  occupy  part  of 
board  1  and  can  be  seen  in  the  schematic  diagram  at  Figure  15. 
A.C.  ripple  from  the  power  supply  is  directed  to  a  one-shot  multi- 
vibrator. The  resulting  TTL  compatible  signal  is  fed  to  the  selector 
switch  and  to  a  74163  4-bit  binary  counter,  the  outputs  of  which  are 
additional  clock  signals  which  may  be  selected  by  SI.  The  external 
clock  input  is  also  shown.  This  clock  signal  is  used  in  the  Burst 
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generator,  (see  Figure  17)  as  well  as  in  one  of  the  NORs  of  IC  16. 

One  of  the  by-products  of  Burst  generation  is  the  signal  labeled 
"A",  shown  as  an  input  to  the  clock  section  in  Figure  15.  The  timing 
diagram  of  Figure  16,  illustrates  the  relationship  between  the  selected 
clock  (system  clock  on  Figure  16)  and  signal  "A".  This  signal  is  used 
to  drive  a  7473  flip-flop  and  a  74163  4-bit  counter.  The  feedback  from 
QD  to  the  CLR  input  causes  the  74163  to  divide  by  9.  These  two  signals 
are  used  to  gate  the  counter  for  display  of  8  or  64  clock  cycles  respec- 
tively. The  selection  is  made  by  either  grounding  or  applying  5  volts 
to  the  display  select  input.  This  is  done  by  a  front  panel  switch. 

The  output  of  the  AOI  (IC  20)  is  the  same  as  the  signal  labeled 
"wait  light"  on  Figure  16  for  the  selection  of  the  signal  from  the 
7473.  If  the  74163  was  selected  the  signal  would  remain  low  for  64 
clock  cycles  and  high  for  8  cycles.  The  output  of  the  AOI  is  used  to 
derive  the  remaining  control  signals.  NORing  it  with  the  system  clock 
produces  the  CPU  clock.  This  signal  is  out  of  phase  with  the  system 
clock  and  is  inhibited  when  the  display  is  in  its  update  cycle  (see 
Figure  16).  "C"  and  "F"  are  signals  similar  to  "A"  but  of  different 
phases.  In  conjunction  with  the  output  of  the  AOI  these  generate  the 
STROBE  and  CLEAR  signals  for  the  counter.  The  "wait  light"  is  driven 
by  an  open  collector  inverter. 

3.2  Burst  Generators 

The  input  Bursts  to  the  CPU  are  generated  by  the  circuit  of  Figure 
17.  The  heart  of  the  Burst  generator  is  the  shift  register  and  the  NAND 
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gate  in  the  center  of  the  diagram.  These  are  connected  to  cause  the 
shift  register  to  always  contain  one  zero  and  seven  ones.  The  circuit 
is  self-starting  in  less  than  8  cycles.  The  outputs  of  Q»,  Qc  and  QF 
are  used  by  the  clock  and  control  circuits.  Using  the  eight  signals 
Q.  to  QH  any  number  of  Burst  generators  can  be  driven.  Each  Burst 
generator  (both  are  shown)  consists  of  8  switches,  8  2-input  OR  gates 
an  8-input  NAND  and  a  D  flip-flop.  These  Burst  generators  generate 
uncompacted  Bursts. 

3.3  The  CPU 

The  actual  arithmetic  functions  of  multiplication  and  division  are 
performed  by  the  circuitry  on  board  2.  This  will  be  illustrated  and 
discussed  in  two  parts:  1)  the  shift  register  and  combinational  logic 
section  and  2)  the  PA  tree. 

The  design  is  basically  the  PA  divider  discussed  in  section  2.4. 
This  has  been  augmented  with  a  few  gates  to  change  easily  between  the 
two  functions.  The  switching  is  performed  by  a  front  panel  switch  which 
supplies  either  5  volts  or  ground  to  the  gates.  During  multiplication 
the  circuit  is  operated  with  the  divider's  combinational  logic  in  place 
and  functioning.  This  logic  is  two  levels  deep  and  does  not  seriously 
degrade  the  multiplier. 

3.3.1  Shift  Register  and  Combinational  Logic 

The  CPU  clock  signal  shown  in  Figure  17  is  used  to  clock  the  shift 
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register  through  which  the  B  Burst  will  flow  (see  Figure  18).  This  Burst 
enters  the  register  through  an  XOR  gate  which  serves  to  pass  or  to  invert 
the  B  Burst  depending  on  the  mode  select  input  (inversion  is  necessary  for 
division).  Other  functions  of  the  mode  select  are  to  inhibit  the  A  Burst 
from  being  fed  directly  into  the  first  PA's  input,  and  to  inhibit  feedback 
from  the  output  of  the  PA  tree. 

To  clarify  the  diagram  of  Figure  18  connections  between  the  shift  regis- 
ter and  the  combinational  logic  are  lettered.  The  use  of  these  letters  is 
local  to  this  one  diagram  and  is  unrelated  to  similar  letters  in  the  Burst 
generator  for  example.  Signals  L  &  M  are  identical.  Two  gates  were  used 
to  prevent  the  fanout  problems  of  driving  14  inputs  in  the  combinational 
logic  section.  The  outputs  of  the  combinational  logic  are  labeled  C,  to 
Cg.  The  corresponding  inputs  are  labeled  similarly  on  the  PA  tree  diagram 
(Figure  19). 

3.3.2  PA  Tree 

The  PA  tree  occupies  the  second  half  of  board  2.  Complete  PAs  are 
used  rather  than  the  circuits  with  less  than  one  flip-flop  per  full  adder 
shown  in  section  2.6. 

The  mode  select  permits  the  flip-flops  to  be  clocked  by  the  A  Burst  or 
by  the  CPU  clock.  The  circuit  description  and  operation  are  exactly  those 
of  the  designs  in  Chapter  2.  The  final  flip-flop  is  needed  to  hold  the 
last  result  for  feedback  and  to  accurately  toggle  the  counter.  Note  that 
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the  output  signal  is  actually  an  inverted  Burst  signal.  A  NOR  gate, 
shown  on  Figure  15,  generates  the  necessary  transitions  from  a  high 
level  to  low  level  to  toggle  the  counter  whenever  the  output  result  is 
"1"  and  the  CPU  clock  goes  high. 

3.4  Display  Circuitry 

The  display  section  has  three  inputs.  Besides  the  toggle  signal 
mentioned  above,  the  Strobe  and  Reset  signals  generated  by  the  clock 
section  are  used  to  transfer  data  from  the  two  digit  decimal  counter 
to  the  storage  registers  and  to  clear  the  counter.  The  result  is 
converted  to  seven  segment  display  format  with  the  leading  zero  blanked. 
Figure  20  is  a  schematic  of  this  section. 

3.5  Discussion  of  Operation 

The  CPU  will  begin  to  run  immediately  when  power  is  applied.  An 
external  TTL  compatible  clock  must  be  furnished  if  the  EXT  position  of 
the  clock  selector  is  chosen.  Positions  1  to  5  correspond  to  internally 
generated  clock  speeds  of  3.75,  7.5,  15,  30  and  60  Hz.  The  display 
format  labeled  "Burst"  causes  the  unit  to  compute  for  8  clock  cycles  and 
to  drop  into  display  update  mode  for  the  next  8  cycles.   "Super  Burst" 
will  cause  the  unit  to  compute  for  64  clock  cycles  and  to  drop  into 
display  update  mode  for  the  next  8  cycles. 

A  small  dot  in  the  seven  segment  display  will  be  on  whenever  the 
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unit  is  performing  display  update.  This  "wait  light",  so  called  because 
the  CPU  must  wait  for  the  output  to  be  performed,  serves  several  purposes. 
When  the  output  is  not  changing,  it  assures  the  user  that  the  unit 
actually  is  computing.  It  is  also  useful  when  experimenting  with  fractional 
input  Burst  patterns,  such  as  representing  3-1/2  by  3  4  3  4.  The 
switches  that  control  the  input  Bursts  can  be  safely  reset  during  the 
time  that  the  light  is  on. 

If  these  switches  are  changed  at  other  times  the  results  are  unpre- 
dictable. The  slower  clock  speeds  are  useful  for  this  type  of  experi- 
mentation. 

The  display  format  of  the  unit  is  a  decimal  number,  not  octal  as 
might  be  guessed  from  the  8-bit  input  Bursts.  The  interpretation  of  the 
result  must  also  take  into  account  the  fact  that  multiplication  is  scaled 
to  prevent  overflow  in  the  8-bit  output  Burst.  Division  is  not  scaled, 
hence  overflow  occurs  only  upon  devision  by  zero.  Division  by  one  yields 
precisely  the  numerator.  Caution  must  be  exercised  when  interpreting 
the  display  with  fast  external  clocks  as  changing  displays  are  easily  mis- 
read because  of  the  inability  of  the  eye  to  perceive  the  changes. 

Because  it  is  inherent  in  the  design  of  a  PA  tree  to  have  some  data 
stored  and  to  not  be  able  to  define  precisely  when  that  data  will  come  out, 
a  given  setting  of  the  front  panel  switches  may  not  always  yield  the  same 
result  (although  the  average  will  always  be  correct).  It  is  possible,  for 
example,  to  generate  a  value  of  2-2/3  with  2  3  3  2  3  3  one  time  and  with 
2  2  4  2  2  4  the  next. 


4  CIRCUIT  DESCRIPTIONS,  EVALUATIONS  AND  COMPARISONS 

The  PA  circuits  described  in  chapter  2  have  been  presented  as  a 
viable  method  of  performing  arithmetic  in  low  precision,  real  time 
applications.  This  chapter  will  present  comparisons  to  other  Burst 
designs  and  to  the  more  conventional  methods  of  weighted  binary  and 
the  Digital  Differential  Analyzer  (DDA).  These  comparisons  will  in- 
clude design  and  operational  descriptions  and  present  data  on  hardware 
complexity/cost,  aesthetics,  speed  of  operation,  noise  immunity,  and 
trade-offs  between  these  and  other  design  parameters. 

4.1  BSR  Circuits 

The  BSR  was  used  in  the  ramp  and  vernier  encoders  presented  in 
chapter  1.  The  operating  principle  of  the  BSR  is  that  it  is  a  current 
source  or  sink  whose  strength  is  proportional  to  the  number  of  ones 
in  the  shift  register.  Addition  is  performed  by  current  summing  and 
re-encoding  the  result  with  a  ramp  or  vernier  encoder  (see  Figure  21). 
Two  multiplier  designs  are  possible.  A  simple  design  uses  one  BSR 
to  control  the  voltage  source  of  the  second.  This  scheme  is  a  complete 
multiplier  when  the  output  of  the  second  BSR  is  re-encoded.  If  higher 
precision  is  needed  and  is  available,  as  when  the  inputs  come  from  a 
vernier  encoder,  a  time  slip  cascade  multiplier  is  possible.  (1)  This 
design  requires  an  entire  10  block  segment  of  input  to  be  trapped  and 
be  available  for  processing.  Selection  of  bits  is  made  by  the  same 
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vernier  encoder  that  is  encoding  the  result. 

The  BSR  approach  is  conceptually  very   simple,  and  has  good  possi- 
bilities for  IC  versions  of  entire  Burst  processors.  The  design  of 
BSRs  from  discrete  components  is  more  difficult  and  is  probably  the 
reason  for  I  EL  projects  using  PA  designs  instead  of  BSRs. 

BSR  designs  are  quick  to  respond  to  input  changes  and  have  a  uni- 
versal building  block  (the  BSR)  much  as  PA  designs  do.  The  outputs  are 
always  compacted  Bursts  while  the  inputs  may  or  may  not  be  compacted. 
BSR  designs  are  well  adapted  to  designing  digital  filters.  It  is  possible 
to  vary  the  weights  of  the  individual  bits  in  the  BSR  so  that  the 
optimum  n  bit  approximation  to  a  given  window  (e.g.  Hamming)  is  achieved. 
This  can  be  done  more  accurately  with  BSRs  than  with  any  other  method 
described  in  this  chapter. 

An  approximation  of  the  hardware  complexity  using  3  gate  equiva- 
lents per  flip-flop  and  1  gate  equivalent  for  each  current  source 
yields  a  complexity  of  about  160  gates  for  the  adder  and  multiplier 
and  240  gates  for  the  divider.  As  with  all  Burst  methods  there  is  no 
need  for  synchronization  and  noise  immunity  is  high.  Comparison  charts 
are  available  in  section  6  of  this  chapter. 

4.2  PA  Circuits 

A  thorough  description  of  the  PA  arithmetic  circuits  will  not  be 
repeated  here,  but  rather,  unique  characteristics  and  comparisons  will 
be  discussed.  The  comparisons  will  be  made  both  on  the  basis  of 
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compacted  and  uncompacted  Bursts;  i.e.,  results  will  be  given  with  and 
without  a  compactor  running  after  the  PA  circuitry. 

The  PA  adder  is  the  simplest  Burst  adder  with  or  without  the  com- 
pactor. The  circuit  is  approximately  6  gate  equivalents  compared  to 
160  for  BSRs.  The  complexity  is  about  the  same  as  the  sampler  adder 
(see  4.3.1),  but  that  circuit  has  a  significant  drawback. 

One  problem  with  PA  circuits,  especially  the  multipliers  and 
dividers  is  the  storage  of  data  in  the  PA  tree.  For  circuits  with 
complete  PA,  as  in  the  CPU  of  chapter  3,  1/2  bit  per  level  in  the  tree 
can  be  stored.  In  the  tree  with  one  flip-flop  per  level  slightly 
less  than  one  bit  is  stored.  This  storage  can  cause  a  somewhat  higher 
deviation  of  the  output  from  the  expected  value  than  BSR  circuits. 
Computer  simulations  and  experience  with  the  CPU  have  shown  that  this 
deviation  is  a  very  complex  function  of  the  inputs  and  the  condition  of 
the  tree  prior  to  the  operation.  For  applications  which  may  be  sensi- 
tive to  such  deviations  longer  averaging  at  the  output  completely  elimi- 
nates this  phenomenon  at  the  expense  of  some  high  frequency  response. 

The  complexity  of  the  PA  multiplier  and  divider  are  lower  than 
any  other  Burst  method  (uncompacted  Bursts)  and  for  the  low  cost  de- 
signs with  one  flip-flop  per  level,  the  complexity  is  the  same  as  weight- 
ed binary  of  the  same  precision  (16  bit  PA  tree  compared  to  4  bit 
binary),  and  is  capable  of  comparable  speeds. 

4.3  Other  Logic  Design  Approaches  to  Bursts 

Logic  design  approaches  to  Burst  arithmetic  other  than  the  PA 
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approach  have  been  made.  BURSTCALC  has  been  constructed  to  demonstrate 
some  of  these  ideas.  The  concept  design  for  a  low  cost  Burst  CPU 
(LOCOBURST)  was  presented  by  Dr.  Poppelbaum  in  March,  1976.  Both  of 
these  approaches  are  now  described. 

4.3.1  BURSTCALC 

BURSTCALC  (5)  contains  a  Burst  adder,  subtractor  and  multiplier/di- 
vider. The  principle  of  operation  of  the  adder  is  known  as  the  selector 
adder.  The  two  input  Bursts  are  alternately  sampled  to  determine  the 
next  output  pulse.  In  an  effort  to  eliminate  problems  of  correlations 
between  this  sampling  scheme  and  the  input  Bursts,  the  sampler  changes 
phase  ewery   Burst.  This  scheme  solves  the  problem  for  constant  Bursts, 
but  any  scheme  which  discards  data  (this  discards  half)  can  have 
errors  due  to  correlations  between  the  discard  pattern  and  the  data. 

The  subtractor  uses  a  shift  register  to  obtain  the  value  of  one 
Burst.  It  then  shifts  in  the  opposite  direction  to  perform  the  sub- 
traction of  the  other  Burst.  Underflow  is  handled  by  changing  the 
sign  and  shifting  data  back  into  the  register  after  the  sign  change. 
If  two  such  units  are  used  and  are  one  Burst  length  out  of  phase  no 
data  is  lost. 

The  multiplier/divider  uses  the  carousel  register  scheme  shown  in 
Figure  22.  Multiplication  requires  a  super  Burst  to  complete  the  cal- 
culation. In  each  Burst  of  the  multiplier  a  different  bit  position  is 
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checked.  If  the  position  so  checked  contains  a  one,  the  multiplicand 
is  added  into  a  reconstruction  register.  During  one  super  Burst  every 
bit  position  in  the  multiplier  is  checked,  so  the  reconstruction  registers 
will  contain  the  product.  This  must  be  reformated  to  be  trans- 
mitted as  a  super  Burst.  It  is  imperative  that  the  inputs  be 
compacted  Bursts  and  that  the  amount  of  deviation  of  Burst  sizes 
be  minimal  if  correlation  problems  are  again  to  be  ignored.  Note 
that  nine  out  of  ten  data  bits  of  the  multiplier  are  discarded. 
This  multiplier  is  also  relatively  slow  to  respond  to  changes  in 
the  inputs.  An  advantage  which  this  multiplier  shares  with  the  PA 
multiplier  is  that  most  of  the  multiplier  circuitry  can  be  used  in 
the  divider. 

The  divider  works  on  a  system  of  comparing  the  length  of  the 
Burst  in  the  divisor  register  with  the  length  of  the  dividend  Burst 
assembled  in  another  register.  When  the  two  lengths  are  equal,  the 
dividend  is  cleared  and  a  one  is  added  to  the  reconstruction  register. 
The  input  Bursts  must  be  compacted  and  the  division  should  be  quite 
constant.  If  uncompacted  Bursts  are  satisfactory  as  output,  the 
reconstruction  circuitry  can  be  eliminated.  This  divider  responds 
quickly  to  changes  of  input  if  used  without  the  reconstructor.  Note 
that  a  noise  pulse  can  confuse  this  divider  as  the  leading  and  trail- 
ing edges  of  the  Burst  control  the  operation  of  the  divider  as  they 
circulate  in  the  registers. 

4.3.2  LOCOBURST 

The  ideas  and  goals  of  LOCOBURST  are  very  different  from  the 
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other  Burst  designs  discussed.  It  is  intended  to  be  a  Burst  CPU 
rather  than  just  a  single  adder,  etc.  The  heart  of  this  design  is 
the  compactor/repeater  shown  in  Figure  23.  Half  of  this  circuit  is 
accepting  bits  and  forming  a  compacted  Burst  while  the  other  half  is 
circulating  and  transmitting  its  contents.  This  same  Burst  will 
circulate  and  be  transmitted  until  the  other  Burst  is  ready.  The 
time  for  this  to  happen  can  vary  from  a  Burst  or  two  to  ten  or  more 
depending  on  the  function  being  performed  and  the  numbers  involved. 

A  \/ery   distinct  advantage  that  this  circuit  has  compared  to  the 
others  is  that  it  eliminates  the  persistant  Burst  problem  of  sum 
of  products  not  equaling  product  of  sums  and  sum  of  quotients  not 
equaling  quotient  of  sums.  This  is  accomplished  by  performing  an 
averaging  before  performing  the  arithmetic. 

The  addition  scheme  is  essentially  the  same  as  the  selector  adder 
of  BURSTCALC.  The  compactor/repeater  is  involved  in  the  role  of  com- 
pacting these  results.  There  are  several  clocking  type  signals  involv- 
ed, including  the  selector  signal  for  the  adder  and  a  vernier  type 
signal  which  is  low  for  9  clock  cycles,  then  high  for  two  cycles. 
Figure  24  shows  the  entire  L0C0BURST  arithmetic  unit.  The  selector 
signal  is  labeled  "s"  and  the  vernier  signal  "e"  in  this  figure. 
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4.4  Weighted  Binary 

This  comparison  to  weighted  binary  is  included  to  show  how  Burst 
processing  in  general  and  the  PA  techniques  in  particular  compare 
with  more  conventional  techniques.  Weighted  binary  is  a  much  more 
compact  representation  than  Bursts.  It  does,  however,  require  synchro- 
nization in  order  to  assign  the  proper  weighting  factor  to  each  bit. 
It  also  has  no  inherent  noise  tolerance.  Error  correcting  codes  can 
improve  noise  tolerance,  but  these  will  not  be  discussed  here. 

The  binary  adder  is,  of  course,  very  similar  to  the  PA,  and  con- 
sists of  a  full  adder  and  a  flip-flop.  Because  of  their  similarity, 
the  comparison  of  the  PA  to  the  binary  adder  is  just  a  comparison  of 
the  two  representations.  The  binary  adder  can  process  more  data  be- 
cause of  the  more  compact  representation,  but  in  the  presence  of  noise 
the  PA's  answer  will  usually  be  more  accurate. 

The  weighted  binary  multiplier  used  in  these  comparisons  is  shown  in 
Figure  25.  Binary  multiplication  will  always  yield  more  precision 
than  contained  in  the  input.  This  multiplier  truncates  the  low 
order  bits  to  make  the  output  compatible  with  the  input.  No  clocking 
nor  control  circuitry  is  shown.  A  four  bit  design  is  shown  since  it 
is  easily  compared  to  the  Burst  designs.  The  complexity  in  hardware  is 
directly  proportional  to  the  number  of  bits  used. 

Weighted  binary  division  require  more  hardware  than  the  multiplier. 
The  principal  addition  is  a  magnitude  detector  to  determine  whether 
subtraction  is  possible.  The  divider,  of  course,  uses  a  subtractor 
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in  place  of  the  adder. 

4.5  Other  Arithmetic  Processors 

The  comparison  of  Bursts  to  such  techniques  as  the  Digital 
Differential  Analyzer  (DDA)  (6)  and  the  Binary  Rate  Multiplier  (BRM)(7) 
have  been  attempted,  but  because  of  the  specialized  uses  of  these 
circuits  a  direct  comparison  is  not  possible.  The  principal  function 
of  the  DDA  and  BRM  is  as  an  integrator.  When  these  devices  are  used 
for  simple  multiplication  or  addition,  hardware  is  being  under-utilized. 
To  attempt  to  use  Burst  hardware  to  perform  integration  is  difficult 
because  the  dynamic  range  of  number  representation  with  Bursts  is  in- 
sufficient. 

4.6  Summary 

This  section  consists  of  a  series  of  tables  that  compare  various 
features  of  adders,  multipliers  and  dividers  for  the  various  process- 
ing techniques  discussed. 
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5  CONCLUSIONS 

The  PA  has  been  shown  to  be  a  very   competitive  (in  many  respects 
the  best)  way  to  perform  arithmetic  on  data  in  the  Burst  format.  Even 
with  the  large  amount  of  redundancy  compared  to  weighted  binary  the  PA 
circuits  compare  favorably  in  both  complexity  and  speed  of  operation 
with  weighted  binary  circuits. 

The  major  advantages  over  other  logic  design  approaches  to  Burst 
processing  are  the  ability  to  use  the  data  on  line  and  to  respond  very 
quickly  to  changes  in  that  data.  An  additional  advantage  is  that  the 
PA  designs  do  not  ignore  any  of  the  incoming  data  making  the  assumption 
that  the  ignored  data  is  identical  to  the  last  data  received. 

As  with  all  Burst  designs  other  than  LOCOBURST,  there  is  the  per- 
sistent problem  that  the  sum  of  products  does  not  equal  the  product  of 
sums.  If  the  rate  of  change  of  the  input  is  small,  this  is  not  a  serious 
problem. 

The  PA  designs  presented  are  easy  to  construct  because  only  readily 
available  parts  are  needed  and  the  parts  count  is  quite  low.  Since  the 
entire  calculation  is  done  with  logic  the  parts  matching  and  adjustment 
problems  of  the  BSR  are  eliminated. 

PA  designs  ushered  in  the  concept  of  the  uncompacted  Burst,  princi- 
pally because  this  design  was  capable  of  accepting  as  well  as  producing 
data  in  that  format.  Uncompacted  Bursts  have  both  advantages  and  dis- 
advantages. The  uncompacted  format  allows  the  processor  to  respond  to 
input  changes  more  quickly  and  is  generally  cheaper.  The  principle  dis- 
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advantage  is  the  loss  of  the  ability  to  do  some  error  correction  before 
the  data  is  processed  (majority  function  decoding,  etc.). 

Both  sign  and  magnitude  and  a  biased  representation  of  signal  num- 
bers have  been  investigated.  Generalized  arithmetic  in  either  system 
nearly  doubles  the  complexity  of  the  hardware,  however,  many  applications 
(those  requiring  only  addition/subtraction  or  only  multiplication/divi- 
sion) can  be  handled  with  little  or  no  increase  in  hardware  complexity. 

The  utility  of  the  PA  designs  has  been  verified  by  the  students  of 
the  Information  Engineering  Laboratory  (IEL)  who  have  chosen  or  adopted 
PA  designs  for  projects  which  required  Burst  arithmetic  processing. 
These  projects  provide  excellent  evidence  that  this  search  for  a  better 
approach  to  Burst  arithmetic  processing  has  been  successful. 
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Appendix  A:  ICs  Used  in  the  CPU 
Board  1 


Position  # 

Type 

Description 

1 

7432 

Quad  2-input  OR 

2 

7432 

Quad  2-input  OR 

3 

7430 

8-input  NAND 

4 
5 
6 

7430 

8-input  NAND 

74121 

Monostable  Flip-flop 

7 

74163 

4-bit  Counter 

8 

7474 

Dual  D  Flip-flop 

9 

74164 

8-bit  Parallel-out  SR 

10 

7430 

8-input  NAND 

11 

7432 

Quad  2-input  OR 

12 

7432 

Quad  2-input  OR 

13 

7446 

7-segment  Decoder/ Driver 

14 

7475 

4-bit  Latch 

15 

7490 

Decade  Counter 

16 

7402 

Quad  2-input  NOR 

17 

7404 

Hex  inverter 

18 

74163 

4-bit  Counter 

19 

7473 

Dual  J-K  Flip-flop 

20 

7451 

Dual  2-wide  2-input  AOI 

21 

7405 

Open-collector  Hex  Inverter 

22 

7490 

Decade  Counter 

23 

7475 

4-bit  Latch 

24 

7446 

7-segment  Decoder/Driver 
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Board  2 


Position  # 

Type 

1 

7486 

2 

7430 

3 

74164 

4 

7430 

5 

7430 

6 

7430 

7 

7400 

8 

7400 

9 

7420 

10 

7430 

11 

7400 

12 

7400 

13 

7400 

14 

7474 

15 

74H183 

16 

7474 

17 

74H183 

18 



19 



20 

74H183 

21 

7474 

22 

74H183 

23 

7474 

24 

7402 

Description 

Quad  2-input  XOR 
8- input  NAND 
8-bit  Parallel-out  SR 
8-input  NAND 
8-input  NAND 
8-input  NAND 
Quad  2-input  NAND 
Quad  2-input  NAND 
Dual  4-input  NAND 
8-input  NAND 
Quad  2-input  NAND 
Quad  2-input  NAND 
Quad  2-input  NAND 
Dual  D  Flip-flop 
Dual  Full  Adder 
Dual  D  Flip-flop 
Dual  Full  Adder 


Dual  Full  Adder 
Dual  D  Flip-flop 
Dual  Full  Adder 
Dual  D  Flip-flop 
Quad  2-input  NOR 
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Appendix  B:  Edge  Connector  Pin  Assignments 
Board  1 


Pin  # 

1-8 

9-16 

17 

18 

19 

20 

30 
31 
32 
33 
34 
35 

60 
61 

64 

65-71 

72-78 

79 

80 


Pin  # 

1 
2 
3 
4 

78 
79 
80 


Board  2 


Description 

Switches  for  "A"  Burst 
Switches  for  "B"  Burst 
Master  clock  from  selector 
Display  format  switch 
DATA  from  CPU  board 
Clock  to  CPU 

A.C.  ripple  from  power  supply 


Clock  1 

to 

selector 

Clock  2 

to 

selector 

Clock  3 

to 

selector 

Clock  4 

to 

selector 

Clock  5 

to 

selector 

B  Burst 

to 

CPU 

A  Burst 

to 

CPU 

Wait  Light 

Seven  segment  units  display 

Seven  segment  tens  display 

Ground 


cc 


(+5  volts) 


Description 

CPU  clock  in 
Mode  select  switch 
Burst  A  in 
Burst  B  in 


DATA  result  out 

Ground 

V    (+5  volts) 
cc 


VITA 
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